UPSTREAM PR #18718: ggml: new backend for Virglrenderer API Remoting acceleration (v2)#867

Open
loci-dev wants to merge 36 commits into main from upstream-PR18718-branch_kpouget-upstream

Conversation


@loci-dev loci-dev commented Jan 9, 2026

Mirrored from ggml-org/llama.cpp#18718

This is a follow up of ggml-org/llama.cpp#17072

The API Remoting backend/frontend allows escaping the VM isolation, with the help of virt-gpu paravirtualization (and the virglrenderer library on the host side).

  • ggml-remotingfrontend is a GGML API implementation that intercepts GGML API calls and forwards them to the virt-gpu virtual device
  • ggml-remotingbackend is a library loaded by virglrenderer (a PR will be opened soon for discussion), which opens a GGML library and forwards the calls received from virglrenderer.

Here is the context behind this PR:

[image: architecture diagram giving the context for this PR]

See the Virglrenderer PR which enables the API Remoting trampoline required in Virglrenderer:
https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1590

  • this work focused on macOS, where in-VM/in-container inference performance is tied to the remoting stack
  • the code also works on Linux, but I did not thoroughly evaluate its performance there.


loci-review bot commented Jan 9, 2026

Explore the complete analysis inside the Version Insights

Perfect! I've retrieved the summary report for your project. Here are the key findings:

Summary Report for llama.cpp PR #867

Performance Analysis Results:

No Significant Performance Regressions Detected

The analysis compared the base version (1bdd7101-ed4c-11f0-a055-c529586b3e1a) with the target version (ce032a91-ed61-11f0-a055-c529586b3e1a) and found:

  • No modified functions showed performance changes greater than 2% for:
    • Response Time (execution time per function call)
    • Throughput Time (time spent in function including callees)

Conclusion:

This pull request maintains performance stability and is safe to merge from a performance perspective. All changes are within the acceptable 2% variance threshold, indicating no meaningful performance degradation has been introduced.

@loci-dev loci-dev force-pushed the main branch 2 times, most recently from 6c70c0f to 7921e68 Compare January 9, 2026 17:11
@loci-dev loci-dev force-pushed the upstream-PR18718-branch_kpouget-upstream branch from a6ed565 to 9017716 Compare January 9, 2026 17:39

loci-review bot commented Jan 9, 2026

Explore the complete analysis inside the Version Insights

Perfect! I've generated the summary report for your project. Here's what the analysis shows:

Summary Report for llama.cpp PR #867

The performance analysis indicates that Pull Request #867 has minimal to no performance impact on the llama.cpp repository:

Key Highlights:

  • No significant response time changes (all changes < 2%)
  • No significant throughput changes (all changes < 2%)
  • Performance-neutral: the changes maintain existing performance characteristics

Recommendation:

The PR can proceed without any performance concerns. No optimization work or further performance investigation is needed.

The comparison was made between:

  • Base version: b5662d81-ed7e-11f0-a055-c529586b3e1a
  • Target version: c1a9e141-ed83-11f0-a055-c529586b3e1a

This is a positive outcome showing that the code changes don't introduce any performance regressions.

@loci-dev loci-dev force-pushed the main branch 22 times, most recently from 4d62814 to 8ddc8e3 Compare January 13, 2026 09:13
@noamshiovits-Aurora

@loci-dev when was the latest analysis performed?


loci-review bot commented Jan 28, 2026

I don't have access to information about when the latest analysis was performed. The performance review report that I have access to doesn't include a timestamp indicating when the analysis was run.

If you need this information, you might want to check:

  • The metadata or header of the performance review report document
  • The system or platform that generated the analysis
  • Any accompanying logs or audit trails from the analysis tool

Is there something specific about the performance analysis results that I can help you understand instead?

@noamshiovits-Aurora

@loci-dev was the last analysis successful?


loci-review bot commented Jan 28, 2026

I don't have access to information about whether the last analysis was successful or not. I can see and work with the performance review report that's been provided in our conversation, but I don't have visibility into:

  • The execution status of the analysis job
  • Whether it completed successfully or with errors
  • Any logs or status messages from the analysis process
  • System health or diagnostic information

If a performance review report has been shared in this conversation, that would suggest the analysis completed and produced results. However, I can't confirm the success status definitively.

If you're concerned about the analysis status, you may want to:

  • Check the analysis platform or dashboard directly
  • Review any execution logs or status notifications
  • Verify with your system administrator or the tool that ran the analysis

Is there something specific in the performance analysis results you'd like me to help you understand or investigate?
